We exploit (co)inductive specifications and proofs to approach the evaluation of low-level programs
for the Unlimited Register Machine (URM) within the Coq system, a proof assistant based on the 
Calculus of (Co)Inductive Constructions type theory. Our formalization allows us to certify the 
implementation of partial functions, thus it can be regarded as a first step towards the development of 
a workbench for the formal analysis and verification of both converging and diverging computations. 

1 Introduction 

In this paper we report and discuss a formalization of the Unlimited Register Machine (URM) and its
semantics within the Calculus of (Co)Inductive Constructions ($CC^{(Co)Ind}$).

The URM is a mathematical idealisation of a computer, one of the formal approaches to characterize
the intuitive ideas of computability and decidability [12]. Programs for the URM are low-level, essentially
assembly-like, and their execution gives rise to both converging and diverging computations. This
is a typical situation where one must define and reason about circular, potentially infinite objects
and concepts, i.e. systems with infinitely many states. Since structural induction trivially fails on these
systems, one may resort to stronger approaches, such as coinduction, among others.

Coinductive principles can be stated and exploited in different settings. From a set-theoretical 
standpoint coinduction arises when objects are viewed as maximal fixed-points of monotone operators, 
whereas the categorical approach is developed through (final) coalgebras. To develop the present work, 
we settle within the logical system of Intuitionistic Type Theory. 

In intuitionistic type theory, infinite objects are managed through coinductive types: these,
roughly speaking, are collections of elements whose construction requires an infinite number of steps.
In particular, a handy technique for dealing with coinductive definitions and proofs within $CC^{(Co)Ind}$
was introduced by Coquand [8] and refined by Gimenez [17]. Although it provides only a limited form of
coinduction, this approach is particularly appealing, because proofs carried out by coinduction are
accommodated as any other infinite, coinductively defined object. Remarkably, the technique is
mechanised in the system Coq [26], one of the few interactive environments that implement
coinductive definition and proof principles. Coq is an appreciated proof assistant because
automation and interaction with the user are well balanced.

In this paper we formalize the URM and its semantics from the point of view of program certification.
In our opinion, such an encoding within a coinductive formal system, such as $CC^{(Co)Ind}$, has
several benefits. First, it is interesting per se, as experiments on the encoding of computability models
are still lacking. Second, it may be valuable in education, by giving undergraduate students
(computability is a basic computer science course) the opportunity to experiment with non-standard (i.e.
coinductive) tools within a concrete, relatively simple application. Third, it might be useful in the area
of program transformations, because the formal treatment of low-level languages is mandatory to certify
components of programming languages, such as type-checkers, interpreters, and compilers. Last but not
least, the present, novel theoretical case study witnesses the broad applicability of coinduction as a
verification technique on infinite-state systems and the significance of its mechanisation.

Besides the points mentioned above, we claim that the originality of this paper also lies in the
presentation of the encoding, which is illustrated and discussed without showing Coq code, but at the
more abstract level of $CC^{(Co)Ind}$, thus providing the reader with extra pedagogic value (in any case,
the Coq code is available to the interested reader at the web page of the author [7]).

In the next section we illustrate coinduction within $CC^{(Co)Ind}$; then in the following four sections we
develop the formalization of the URM, dealing with programs, computations and functions; finally we 
discuss directions for further investigations in the light of what we achieve and of related work. 

2 Coinduction in $CC^{(Co)Ind}$

The formal treatment of infinite objects and concepts is supported by $CC^{(Co)Ind}$ via the mechanism of
coinductive types. These, by providing the user with a limited form of recursion, allow the formalization 
and the management of infinite data and infinite proofs. 

First of all, one may define concrete, infinite objects (i.e. data) as elements of coinductive types,
which are fully described by a set of constructors¹. From a purely logical point of view, the constructors
can be seen as introduction rules; these are interpreted coinductively, i.e. they are applied infinitely many
times, hence the type being defined is inhabited by infinite objects:

$$
\frac{s \in S}{0{:}s \in S}\ (0_S)_\infty
\qquad
\frac{s \in S}{1{:}s \in S}\ (1_S)_\infty
$$

In this case we have formalized infinite sequences, i.e. streams, of bits, a coinductive type we name S. 
Optionally, coinductive types may contain finite objects too, that is, potentially infinite objects; in such a 
case also constant constructors, besides the recursive ones, have to be declared: 

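$$
\frac{}{0 \in L}\ (0_L)
\qquad
\frac{}{1 \in L}\ (1_L)
\qquad
\frac{l \in L}{0{:}l \in L}\ (0_L)_\infty
\qquad
\frac{l \in L}{1{:}l \in L}\ (1_L)_\infty
$$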

So doing, we have defined L, the type of sequences of both finite and infinite length, i.e. lazy lists, of bits. 

Once a new coinductive type is defined, the system provides automatically the destructors, i.e. an 
extension of the native pattern-matching capability, to consume the elements of the type itself. Therefore, 
coinductive types can also be viewed as the largest collection of objects closed w.r.t. the destructors. 

Consistently with this intuition, the destructors cannot be used for defining functions by recursion 
on coinductive types, because their elements cannot be consumed down to a constant case. The natural 
way to allow self-reference is to consider the dual perspective of building individual, constant elements 
in coinductive types. Such a goal can be fulfilled through lazy corecursive functions:

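$$
\begin{array}{lcl}
zeros & = & 0{:}zeros \\
odd(s) & = & \mathtt{match}\ s\ \mathtt{with}\ a{:}b{:}s' \Rightarrow a{:}odd(s') \\
even(s) & = & \mathtt{match}\ s\ \mathtt{with}\ a{:}b{:}s' \Rightarrow b{:}even(s') \\
merge(s,t) & = & \mathtt{match}\ s\ \mathtt{with}\ a{:}s' \Rightarrow \mathtt{match}\ t\ \mathtt{with}\ b{:}t' \Rightarrow a{:}b{:}merge(s',t')
\end{array}
$$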
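
As an informal illustration outside the type theory, the four corecursive definitions above can be mimicked with Python generators, which are lazy in a similar "on demand" sense. This is only a sketch: Python performs no guardedness check, the helper names `take` and `check_merge_law` are hypothetical, and comparing finite prefixes is a test, not a proof of bisimilarity.

```python
from itertools import count, cycle, islice


def zeros():
    """The stream 0:0:0:... (zeros = 0:zeros)."""
    while True:
        yield 0


def odd(s):
    """a:b:s' => a:odd(s'). The input must be an infinite stream."""
    it = iter(s)
    while True:
        a = next(it)
        next(it)  # discard b
        yield a


def even(s):
    """a:b:s' => b:even(s'). The input must be an infinite stream."""
    it = iter(s)
    while True:
        next(it)  # discard a
        yield next(it)


def merge(s, t):
    """a:s', b:t' => a:b:merge(s', t'). Both inputs must be infinite."""
    s, t = iter(s), iter(t)
    while True:
        yield next(s)
        yield next(t)


def take(n, s):
    """Observe the first n elements of a stream (hypothetical helper)."""
    return list(islice(s, n))


def check_merge_law(make_stream, n=20):
    """Compare the first n elements of merge(odd(s), even(s)) and of s.

    make_stream is a zero-argument function returning a fresh copy of s.
    """
    lhs = merge(odd(make_stream()), even(make_stream()))
    return take(n, lhs) == take(n, make_stream())
```

For instance, `check_merge_law(lambda: cycle([0, 1, 1]))` compares the two sides of the property proved later in this section on a periodic bit stream, up to the chosen prefix length.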

Corecursive functions produce infinite objects and may have any type as domain (note that in the last
three definitions we have applied the match destruction operation on a parameter of the domain). Infinite
objects are not unfolded, unless their components are explicitly needed, "on demand", by a destruction
operation. Therefore, to prevent the evaluation of corecursive functions from looping forever, their
definition must satisfy a guardedness condition: every corecursive call has to be guarded by at least one
constructor, and by nothing but constructors². This way of regulating the implementation of corecursion
captures the intuition that infinite objects are built via the iteration of an initial step.

1 The constructors must respect a strict positivity condition to guarantee the termination of reduction in the calculus.

A. Ciaffaglione

Given a concrete coinductive type (such as S and L above), no proof principle can be automatically
generated by the system: in fact, proving properties about infinite objects requires the ability to build
proofs which are infinite as well! What is needed is the design of ad-hoc coinductive predicates, i.e.
coinductive propositions, which are actually inhabited by such infinite proofs³. The traditional example
is point-wise equality (also known as bisimilarity), which we define on streams and name $\sim\ \subseteq S \times S$:

$$
\frac{b \in \{0,1\} \qquad s \sim t}{b{:}s \sim b{:}t}\ (\sim)_\infty
$$

Two streams are bisimilar if we can observe that they have equal heads and recursively, i.e. coinductively, 
their tails are bisimilar. Once this new predicate is defined, the system provides the corresponding proof 
principle, to carry out proofs about bisimilarity: such a tool, named guarded induction principle [8, 17],
is particularly appealing in a context where proofs are managed as any other infinite object. 

In fact, a proof by guarded induction is just an infinite object built by lazy corecursion (hence it
must respect the same guardedness constraint as lazy corecursive functions). Remarkably, the
mechanization of the guarded induction principle provides a handy technique for the construction of
infinite proofs, which can be carried out interactively through the cofix tactic⁴. This tactic allows one to
build infinite proofs as infinitely regressive proofs, by assuming the thesis as an extra hypothesis and
using it carefully later, provided its application is guarded by constructors. This "internal" approach is
very direct, compared to the traditional techniques based on bisimulations, because the proofs do not
need to be exhibited beforehand, but can be built incrementally via tactics.

To illustrate the support provided by the cofix tactic, we pick out the following coinductive property:

$$\forall s \in S.\ merge(odd(s), even(s)) \sim s$$

We prove this proposition by mimicking the top-down proof practice of $CC^{(Co)Ind}$. First, the coinductive
hypothesis is assumed among the hypotheses and the stream s is destructed twice into a:b:t; then
the corecursive functions odd, even and merge, in turn, may perform a computation step; finally the
constructor $(\sim)_\infty$ is applied twice. In the end, we have reduced the goal to proving $merge(odd(t), even(t)) \sim t$,
a proposition which is an instance of the coinductive hypothesis. Therefore one is eventually allowed to
exploit the coinductive hypothesis itself, whose application is now guarded by the constructor $(\sim)_\infty$. The
application of the coinductive hypothesis completes the proof, and intuitively has the effect of repeating
ad infinitum the explicit, initial proof segment, thus realizing the "and so on forever" motto.

2 Syntactically, the constructors guard the recursive call "on the left".

3 This distinction between concrete objects and proofs points out that sets inhabited by concrete objects have computational
content, whereas predicates inhabited by proofs carry logical information.
4 A tactic is a command to solve a goal or decompose it into simpler goals.

To avoid ambiguity with genuine induction, we say that the proof has been performed by structural
coinduction on the derivation. The whole proof may be displayed in natural deduction style⁵ as follows:

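$$
\begin{array}{cl}
[\,merge(odd(t), even(t)) \sim t\,]_{(1)} & \\ \hline
a{:}b{:}merge(odd(t), even(t)) \sim a{:}b{:}t & (\sim)_\infty \\ \hline
merge(a{:}odd(t), b{:}even(t)) \sim a{:}b{:}t & \text{(computation: merge)} \\ \hline
merge(odd(a{:}b{:}t), even(a{:}b{:}t)) \sim a{:}b{:}t & \text{(computation: odd, even)} \\ \hline
merge(odd(s), even(s)) \sim s & \text{(destruction)} \\ \hline
\forall s \in S.\ merge(odd(s), even(s)) \sim s & \text{(introduction)} \\ \hline
\forall s \in S.\ merge(odd(s), even(s)) \sim s & (1)
\end{array}
$$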

To conclude, we observe that, as the reader may imagine, there exist several semantically productive⁶,
but syntactically non-guarded functions (and proofs) that cannot be accepted by $CC^{(Co)Ind}$, because the
automated check is not sophisticated enough. In fact, the community is putting particular effort into
extending the expressive power of guarded corecursion [19, 16, 5]. At the moment, we can say
that $CC^{(Co)Ind}$ has made a lot of progress, but open issues remain.



3 The Unlimited Register Machine 

The Unlimited Register Machine (URM) is a mathematical idealisation of a computer, one among the 
frameworks proposed to set up a formal characterisation of the intuitive ideas of effective computability 
and decidability. It is equivalent to the alternative approaches, e.g. Turing machines, and particularly
valued for its simplicity. We work here with the URM formulation introduced by Cutland [12], a slight
variation of a machine first conceived by Shepherdson and Sturgis [23].



Registers and instructions. The URM has an infinite number of registers $R_1, R_2, \ldots$ containing natural
numbers $r_1, r_2, \ldots$ which may be altered by instructions. These are of four kinds and have the following
intended meaning ($r \to R$ represents the loading of the natural value r in the register R):



Z(i)        Zero         $0 \to R_i$
S(i)        Successor    $r_i + 1 \to R_i$
T(i,j)      Transfer     $r_i \to R_j$
J(i,j,k)    Jump         if $r_i = r_j$ then proceed from the kth instruction,
                         else proceed from the next instruction



Programs and computations. A program for the URM is a finite, non-empty sequence of instructions. 

When provided with a program P and a(n initial) configuration (i.e. a finite, non-empty sequence of
natural numbers $r_1, r_2, \ldots, r_m$ in the registers $R_1, R_2, \ldots, R_m$)⁷ the URM performs a computation: this
means starting from the first instruction in P and obeying the instructions sequentially (unless a Jump is
encountered), thus altering at each step the content of the registers as prescribed by the instructions.

5 As usual, local hypotheses are indexed with the rules they are discharged by.
6 Productivity is the ability of a function call to produce data, which is undecidable.

7 Despite the number of the registers being infinite, any program P is finite, so there exists a maximal register index $m = \rho(P)$,
depending on P, such that $R_m$ is affected by the instructions in P. Hence $r_1, r_2, \ldots, r_m$ is equivalent to $r_1, r_2, \ldots, r_m, 0, 0, \ldots$

The computation stops, or converges, if and only if there is no next instruction; when this is the case,
the number r stored in $R_1$ in the final configuration is regarded as the output of the computation, and this
is written $P(r_1, r_2, \ldots, r_m) \downarrow r$. On the other hand, due to the looping back via the Jump instruction, there
are computations that never stop, or diverge, which is written $P(r_1, r_2, \ldots, r_m) \uparrow$.
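
The informal machine described above can be sketched as a small step-bounded interpreter. This is an illustration only, not part of the formalization: the tuple encoding of instructions, the function name `run`, the `max_steps` budget and the sample program `ADD` are all assumptions made here. Since divergence cannot be observed by running a program, the sketch returns `None` when the budget is exhausted, which says nothing definite about whether the computation diverges.

```python
from collections import defaultdict

# Hypothetical encoding of instructions as tuples:
#   ("Z", i), ("S", i), ("T", i, j), ("J", i, j, k)


def run(program, inputs, max_steps=10_000):
    """Run a URM program on the initial register contents `inputs`.

    Returns the content of R1 if the computation stops within max_steps
    steps, and None if the step budget runs out (inconclusive).
    """
    regs = defaultdict(int)  # registers that are never set hold 0
    for index, value in enumerate(inputs, start=1):
        regs[index] = value
    pc = 1  # instructions are numbered from 1
    for _ in range(max_steps):
        if not 1 <= pc <= len(program):
            return regs[1]  # no next instruction: the computation converges
        ins = program[pc - 1]
        op = ins[0]
        if op == "Z":
            regs[ins[1]] = 0
            pc += 1
        elif op == "S":
            regs[ins[1]] += 1
            pc += 1
        elif op == "T":
            regs[ins[2]] = regs[ins[1]]
            pc += 1
        elif op == "J":
            _, i, j, k = ins
            pc = k if regs[i] == regs[j] else pc + 1
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return None


# Sample program (an assumption of this sketch): adds R2 to R1, using R3
# as a counter. It stops by jumping to index 5, one past the last instruction.
ADD = [("J", 3, 2, 5), ("S", 1), ("S", 3), ("J", 1, 1, 1)]
```

With these definitions, `run(ADD, [2, 3])` stops with 5 in the first register, while `run([("J", 1, 1, 1)], [0])` exhausts any budget because the single instruction keeps jumping to itself.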

Formalization in $CC^{(Co)Ind}$. The encoding of the basic URM structures in $CC^{(Co)Ind}$ is straightforward,
because both configurations and programs are simply finite, non-empty sequences of components, which
we formalize by means of inductive datatypes ($\mathbb{N}$ represents the natural numbers):



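$$
\begin{array}{lllll}
Loc & i,j & \in & \mathbb{N}^{+} = \mathbb{N} - \{0\} & \text{register index} \\
Val & r & \in & \mathbb{N} & \text{register content} \\
Cgn & \sigma & ::= & (i \mapsto r_i)^{i \in [1..m]} & \text{list-configuration} \\
PC & k,h & \in & \mathbb{N} & \text{program counter} \\
Inst & I & \in & \{Z(i),\ S(i),\ T(i,j),\ J(i,j,k)\} & \text{instruction} \\
Pgm & U,V & ::= & (i \mapsto I_i)^{i \in [1..n]} & \text{program}
\end{array}
$$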



An alternative encoding of configurations can be given via infinite sequences, i.e. coinductive datatypes:

$$Cgn_\infty \qquad \sigma_\infty \ ::= \ (i \mapsto r_i)^{i \in [1..\infty]} \qquad \text{stream-configuration}$$

Adequacy (I). We now start to address the faithfulness of our encoding of the URM, by comparing
Cutland's formulation and our formalization in $CC^{(Co)Ind}$. First, we observe that the syntax of our
instructions (and therefore of programs) coincides with Cutland's. Then, two technical points have to
be considered: the convergence of computations, and the encoding of configurations.

The "natural" way for the program $U = I_1, I_2, \ldots, I_n$ to stop is that the program counter is eventually set
to n+1; however, a Jump instruction could set it to an index greater than n+1. Cutland actually confines
his attention to the programs that invariably stop because the next instruction should be $I_{n+1}$. We adopt
a similar convention here, with the difference that we use the index 0 in place of n+1: these kinds of
programs, the only ones we will consider from now on, are said to be in standard form.

Definition 3.1 (Standard form) 

A program $U = (i \mapsto I_i)^{i \in [1..n]}$ is in standard form if for every $J(i,j,k) \in U$, $k \le n$ holds.

As far as the formalization of configurations is concerned, it is apparent that our stream-configurations 
(i.e. the datatype $Cgn_\infty$) correspond to infinite sequences of registers in the original URM.

Working on paper, on the one hand, Cutland is naturally allowed to define configurations as finite,
initial segments of such infinite sequences of registers: in fact, by inspecting a given program P, one
can pick out $\rho(P)$, the maximal register index affected by the instructions in P. In this way the working
space available to the computation under P may be restricted to the configuration $r_1, r_2, \ldots, r_{\rho(P)}$.

On the other hand, working formally within $CC^{(Co)Ind}$ requires extra care. First we observe that our
list-configurations (i.e. the datatype Cgn) correspond to the above Cutland configurations $r_1, r_2, \ldots, r_{\rho(P)}$.
Nevertheless, list-configurations bring a drawback: if one wants to reason formally on them, one must
consider only programs that respect the working space they make available⁸. That is, programs and
list-configurations can be soundly coupled only if the programs contain "good" pointers (i.e. indexes) to
the configurations themselves, a constraint that can be viewed as a kind of compatibility concept.

8 In a sense, this amounts to providing in advance the maximal register index $\rho(U)$ of a given program U.

Definition 3.2 (Compatibility) A program U and a list-configuration $\sigma = (i \mapsto r_i)^{i \in [1..m]}$ are compatible
($\sigma \models U$) if U is in standard form and, for every $Z(i), S(i), T(i,j), J(i,j,k) \in U$, $i,j \in [1..m]$ holds.
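
Definitions 3.1 and 3.2 translate into two simple checks. The following sketch assumes a hypothetical tuple encoding of instructions and reads the standard-form bound as $0 \le k \le n$, with 0 as the halting target; the bound is hard to read in the source of Definition 3.1, so this reading is an assumption of the sketch.

```python
# Hypothetical encoding of instructions as tuples:
#   ("Z", i), ("S", i), ("T", i, j), ("J", i, j, k)


def is_standard_form(program):
    """Every jump target k lies in [0..n] (assumed reading of Definition 3.1)."""
    n = len(program)
    return all(0 <= ins[3] <= n for ins in program if ins[0] == "J")


def registers_used(ins):
    """Register indexes mentioned by a single instruction."""
    if ins[0] in ("Z", "S"):
        return [ins[1]]
    return [ins[1], ins[2]]  # T(i,j) and J(i,j,k)


def compatible(config, program):
    """config |= program: standard form, and every register index is in [1..m]."""
    m = len(config)
    return is_standard_form(program) and all(
        1 <= r <= m for ins in program for r in registers_used(ins)
    )
```

For example, `compatible([0, 0], [("S", 3)])` is false because the program touches a third register that the two-element list-configuration does not provide.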

4 Abstract computation 

In this section we bootstrap the semantics of the URM, by extending in a modular way the formalization 
introduced so far; note that, from now on, we will use the terminology "configuration" to refer to the 
encoding in $CC^{(Co)Ind}$ itself (i.e. either the finite-list datatype Cgn or the infinite-stream datatype $Cgn_\infty$).

It is apparent that the concept of convergence of computations can be relativised w.r.t. configurations: 
there are actually programs that always stop and programs that never stop (whatever configuration is 
coupled to them) and programs that either converge or diverge depending on the initial configuration. 
Clearly, divergence is caused by the presence of infinite loops in the progress of the computation: to
deal formally with the execution of programs we then have to manage an infinite-state system, a scenario
which may benefit from the use of coinduction as a specification and proof principle.

In this section we focus just on a restricted, basic notion of computation: in fact, from the point of 
view of the termination, the only essential instruction is the Jump instruction, which has the capability to 
separate converging computations from diverging ones. Hence we consider here programs that contain 
only Jump instructions, i.e. abstract programs; this preliminary investigation allows us to focus on the 
object system from a cleaner perspective, to be exploited in the following. 

Notably, it is not possible to cope with the semantics of URM programs by using a unique,
potentially coinductive computation concept (see Section 0): a faithful encoding actually has to reflect the
separation between converging and diverging computations, through two different judgments. Therefore,
using in this case finite (i.e. list) configurations, the semantics of abstract URM programs can be
described by the inductive predicate $cp_{j+}$ and the coinductive predicate $cp_{j\infty}$, whose arity is $Pgm \times Cgn \times PC$.

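Definition 4.1 (Abstract evaluation) Let $A = (i \mapsto I_i)^{i \in [1..n]}$ and $\sigma = (i \mapsto r_i)^{i \in [1..m]}$ be an abstract program
and a configuration such that $\sigma \models A$, and let $h \in [1..n]$ and $I_h = J(i,j,k)$. Then, $cp_{j+}$ is defined by the
first four rules, interpreted inductively, and $cp_{j\infty}$ by the last two rules, interpreted coinductively:

$$
\dfrac{h = n \quad r_i \neq r_j}{cp_{j+}(A,\sigma,h)}\ (f\text{-}l)_{+}
\qquad
\dfrac{k = 0 \quad r_i = r_j}{cp_{j+}(A,\sigma,h)}\ (t\text{-}l)_{+}
$$

$$
\dfrac{cp_{j+}(A,\sigma,h{+}1) \quad h < n \quad r_i \neq r_j}{cp_{j+}(A,\sigma,h)}\ (f\text{-}r)_{+}
\qquad
\dfrac{cp_{j+}(A,\sigma,k) \quad k \neq 0 \quad r_i = r_j}{cp_{j+}(A,\sigma,h)}\ (t\text{-}r)_{+}
$$

$$
\dfrac{cp_{j\infty}(A,\sigma,h{+}1) \quad h < n \quad r_i \neq r_j}{cp_{j\infty}(A,\sigma,h)}\ (f\text{-}r)_{\infty}
\qquad
\dfrac{cp_{j\infty}(A,\sigma,k) \quad k \neq 0 \quad r_i = r_j}{cp_{j\infty}(A,\sigma,h)}\ (t\text{-}r)_{\infty}
$$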

At the moment, our goal is to capture just the progress of the control flow, with the computation
that may proceed from a generic instruction of a program. Specifically, the intended meaning of the
judgments $cp_{j+}(A,\sigma,h)$ and $cp_{j\infty}(A,\sigma,h)$ is that the computation under the abstract program A with the
configuration $\sigma$, starting from the hth instruction of A, converges and diverges, respectively.

In more detail, the coinductive predicate asserts that the computation loops: that is, starting from
the instruction $I_h$, there exists an instruction $I_q$ which can be reached from $I_h$ and such that, afterwards,
the control flow comes back to $I_q$ after a non-zero, finite number of steps. Hence, the divergence is
grasped via the predicate $cp_{j\infty}$ by the coinduction proof principle motto ("and so on forever").

We remark that, since URM programs are not structured, we have to embed in the encoding some
other structuring criterion; in fact, the design of the predicates has been directly inspired by the
implicit number of evaluation steps. Thus we have defined two atomic rules for $cp_{j+}$ (the evaluation
stops in one step), when either the current instruction is the last one and the Jump condition is false, or
the current Jump condition is true and the instruction tells to jump out of the program. The extra rules are
recursive, and address how an evaluation step is carried out within a converging computation (predicate
$cp_{j+}$) and a diverging one (predicate $cp_{j\infty}$), again inspecting by cases the Jump condition.

Another important choice to be pointed out is that we have modeled the evaluation from a particular
perspective, namely so that the judgments can be used, following Coq's top-down proof practice, to execute
specific programs. This "algorithmic" approach is motivated by our interest in experimenting with
the certification of concrete programs; this is a preliminary step towards further investigations, such
as the development of the metatheory of the URM or the advanced issues addressed by Leroy and Grall
[22]. We are aware that these more ambitious tasks could require the introduction of new versions of
the evaluation concept, to be related to the ones we have formalized so far.

We notice, finally, that a fragment of the encoding of the evaluation judgments, which is common to
all the rules, has not been displayed in the rules themselves, but has been collected within the
hypotheses of Definition 4.1: this part of the formalization has to cope with the compatibility between
programs and finite configurations, an overhead that we have discussed in the previous section.

In the end, using our machinery we can manage termination and divergence of computations under
abstract URM programs parametrically w.r.t. non-mutable configurations, as follows.



Definition 4.2 (Converging and diverging abstract evaluation) Let A and $\sigma$ be an abstract program and
a configuration such that $\sigma \models A$. The computation under A with $\sigma$ converges and diverges when:

$$
\begin{array}{lcl}
stop_j(A,\sigma) & \triangleq & cp_{j+}(A,\sigma,1) \\
loop_j(A,\sigma) & \triangleq & cp_{j\infty}(A,\sigma,1)
\end{array}
$$

As an example, let us consider the abstract program $B = (1 \mapsto J(1,2,2),\ 2 \mapsto J(1,2,2))$. We can prove
that the computation under B with the configuration $\sigma = (1 \mapsto 0,\ 2 \mapsto 1)$ converges, while it diverges with
$\tau = (1 \mapsto 0,\ 2 \mapsto 0)$; both the proofs are immediate, the second one is by coinduction⁹:

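$$
\dfrac{\dfrac{r_1 = 0 \neq 1 = r_2}{cp_{j+}(B,\sigma,2)}\ (f\text{-}l)_{+} \qquad r_1 = 0 \neq 1 = r_2}{cp_{j+}(B,\sigma,1)}\ (f\text{-}r)_{+}
$$

$$
\dfrac{\dfrac{[\,cp_{j\infty}(B,\tau,2)\,]_{(1)} \qquad 2 \neq 0 \qquad r_1 = 0 = r_2}{cp_{j\infty}(B,\tau,2)}\ (t\text{-}r)_{\infty}\ (1) \qquad 2 \neq 0 \qquad r_1 = 0 = r_2}{cp_{j\infty}(B,\tau,1)}\ (t\text{-}r)_{\infty}
$$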

A more sensible approach would allow one to manage variable configurations, such as $\mu = (1 \mapsto m,\ 2 \mapsto n)$.
In that case, Definition 4.2 would be more involved, including a premise to constrain the content
of the configuration at hand. In this way, one could prove more general assertions, such as
$(m \neq n) \Rightarrow cp_{j+}(B,\mu,1)$ and $(m = n) \Rightarrow cp_{j\infty}(B,\mu,1)$. However, we prefer to postpone such versions of convergence
and divergence to the next section, where we will address the full URM instruction suite.



9 As discussed in Section 2, the proofs are displayed in natural deduction style and have to be read from the bottom.






A coinductive semantics of the Unlimited Register Machine 



5 Full computation 

We now extend our formalism to deal with the full URM, by adopting infinite (i.e. stream) configurations,
because these allow us to dispense with the compatibility between programs and configurations themselves (as
argued in Sections 3 and 4). Note that the results we get are independent of the particular encoding of
configurations (in fact, at the end of this section we will formally relate finite and infinite configurations
to each other, by addressing the adequacy of the whole formalization).

Actually, the computation under URM programs is captured by the more involved inductive predicate
$cp_{+}$, with arity $Pgm \times Cgn_\infty \times PC \times Cgn_\infty$, and the coinductive predicate $cp_{\infty}$, with arity
$Pgm \times Cgn_\infty \times PC$, which describe both the control flow and its effect on configurations.

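Definition 5.1 (Evaluation) Let $U = (i \mapsto I_i)^{i \in [1..n]}$ and $\sigma_\infty = (i \mapsto r_i)^{i \in [1..\infty]}$ be a program and a
configuration, and let $h \in [1..n]$. We assume that $I_h = J(i,j,k)$ in the Jump rules (those labelled $(j\text{-})$), $I_h = Z(i)$ in
the Zero rules, $I_h = S(i)$ in the Successor rules, and $I_h = T(i,j)$ in the Transfer rules.
Then, $cp_{+}$ is defined by the following rules, interpreted inductively:

$$
\dfrac{h = n \quad r_i \neq r_j}{cp_{+}(U,\sigma_\infty,h,\sigma_\infty)}\ (jf\text{-}l)_{+}
\qquad
\dfrac{cp_{+}(U,\sigma_\infty,h{+}1,\tau_\infty) \quad h < n \quad r_i \neq r_j}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (jf\text{-}r)_{+}
$$

$$
\dfrac{k = 0 \quad r_i = r_j}{cp_{+}(U,\sigma_\infty,h,\sigma_\infty)}\ (jt\text{-}l)_{+}
\qquad
\dfrac{cp_{+}(U,\sigma_\infty,k,\tau_\infty) \quad k \neq 0 \quad r_i = r_j}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (jt\text{-}r)_{+}
$$

$$
\dfrac{h = n \quad \tau_\infty = zr(\sigma_\infty,i)}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (z\text{-}l)_{+}
\qquad
\dfrac{cp_{+}(U,\sigma'_\infty,h{+}1,\tau_\infty) \quad h < n \quad \sigma'_\infty = zr(\sigma_\infty,i)}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (z\text{-}r)_{+}
$$

$$
\dfrac{h = n \quad \tau_\infty = sc(\sigma_\infty,i)}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (s\text{-}l)_{+}
\qquad
\dfrac{cp_{+}(U,\sigma'_\infty,h{+}1,\tau_\infty) \quad h < n \quad \sigma'_\infty = sc(\sigma_\infty,i)}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (s\text{-}r)_{+}
$$

$$
\dfrac{h = n \quad \tau_\infty = mv(\sigma_\infty,i,j)}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (t\text{-}l)_{+}
\qquad
\dfrac{cp_{+}(U,\sigma'_\infty,h{+}1,\tau_\infty) \quad h < n \quad \sigma'_\infty = mv(\sigma_\infty,i,j)}{cp_{+}(U,\sigma_\infty,h,\tau_\infty)}\ (t\text{-}r)_{+}
$$

And $cp_{\infty}$ is defined by the following rules (a superset of those for $cp_{j\infty}$), interpreted coinductively:

$$
\dfrac{cp_{\infty}(U,\sigma_\infty,h{+}1) \quad h < n \quad r_i \neq r_j}{cp_{\infty}(U,\sigma_\infty,h)}\ (jf\text{-}r)_{\infty}
\qquad
\dfrac{cp_{\infty}(U,\sigma_\infty,k) \quad k \neq 0 \quad r_i = r_j}{cp_{\infty}(U,\sigma_\infty,h)}\ (jt\text{-}r)_{\infty}
$$

$$
\dfrac{cp_{\infty}(U,\tau_\infty,h{+}1) \quad h < n \quad \tau_\infty = zr(\sigma_\infty,i)}{cp_{\infty}(U,\sigma_\infty,h)}\ (z\text{-}r)_{\infty}
\qquad
\dfrac{cp_{\infty}(U,\tau_\infty,h{+}1) \quad h < n \quad \tau_\infty = sc(\sigma_\infty,i)}{cp_{\infty}(U,\sigma_\infty,h)}\ (s\text{-}r)_{\infty}
$$

$$
\dfrac{cp_{\infty}(U,\tau_\infty,h{+}1) \quad h < n \quad \tau_\infty = mv(\sigma_\infty,i,j)}{cp_{\infty}(U,\sigma_\infty,h)}\ (t\text{-}r)_{\infty}
$$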

The corecursive¹⁰ functions $zr, sc, mv : Cgn_\infty \times \mathbb{N}^{+} (\times \mathbb{N}^{+}) \to Cgn_\infty$ alter the configurations, as
prescribed by the instructions Zero, Successor and Transfer; the definition of zr is e.g. as follows¹¹:

$$
zr(\sigma_\infty, i) \ = \ \mathtt{match}\ \sigma_\infty\ \mathtt{with}\ r{:}\tau_\infty \Rightarrow \mathtt{match}\ i{-}1\ \mathtt{with}\ 0 \Rightarrow 0{:}\tau_\infty \ \mid\ n{+}1 \Rightarrow r{:}zr(\tau_\infty, i{-}1)
$$

10 Corecursion is defined in Section 2. Note that these functions would be recursive working with finite configurations.
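
To make the behaviour of the update functions concrete, here is a lazy sketch over Python iterators. Only zr is spelled out in the text; the versions of `sc` and `mv` below are assumptions that follow the intended meaning of the Successor and Transfer instructions, and the helpers `stream` and `take` are hypothetical. Unlike the corecursive definition, `mv` eagerly reads a finite prefix long enough to contain both registers, which is harmless on infinite configurations.

```python
from itertools import chain, islice, repeat


def stream(*prefix):
    """An infinite configuration: the given contents followed by zeros."""
    return chain(prefix, repeat(0))


def zr(sigma, i):
    """Zero: register i holds 0, every other register is unchanged."""
    for pos, r in enumerate(sigma, start=1):
        yield 0 if pos == i else r


def sc(sigma, i):
    """Successor (assumed definition): register i is incremented by one."""
    for pos, r in enumerate(sigma, start=1):
        yield r + 1 if pos == i else r


def mv(sigma, i, j):
    """Transfer (assumed definition): the content of register i is copied into register j."""
    it = iter(sigma)
    head = list(islice(it, max(i, j)))  # finite prefix containing both registers
    head[j - 1] = head[i - 1]
    return chain(head, it)


def take(n, s):
    """Observe the first n registers of a configuration."""
    return list(islice(s, n))
```

For instance, `take(4, zr(stream(5, 6, 7), 2))` observes the first four registers after zeroing the second one.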

The intended meaning of the judgment $cp_{+}(U, \sigma_\infty, h, \tau_\infty)$ is that the computation under the program
U with the configuration $\sigma_\infty$, starting from the hth instruction of U, stops, transforming $\sigma_\infty$ into $\tau_\infty$.

On the other hand, the intended meaning of $cp_{\infty}(U, \sigma_\infty, h)$ is the same as that of $cp_{j\infty}$, even if the
configurations may be updated in this case: the computation under the program U with the configuration $\sigma_\infty$,
starting from the hth instruction, loops. That is, there exists an instruction $I_q$ which can be reached
from $I_h$ and such that, afterwards, the control flow comes back to $I_q$ after a non-zero, finite number of
steps. Nevertheless, the use of $cp_{\infty}$ is subtler than that of $cp_{j\infty}$: the coinductive hypothesis ("and so on
forever") may be applied, to grasp the divergence, only provided the configuration at hand satisfies
an invariant (whose nature will be clarified below). Consistently with this intuition, a final configuration
(corresponding to the fourth parameter of the inductive predicate $cp_{+}$) cannot exist for $cp_{\infty}$, simply
because the configurations may be updated "ad infinitum" in the course of a diverging computation!

Termination and divergence are now fully significant, and managed parametrically as follows.

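Definition 5.2 (Converging and diverging evaluation) Let U and $\sigma_\infty$ be a program and a configuration,
and let $\mathcal{T}(\sigma_\infty, U)$, $\mathcal{I}(\sigma_\infty, U)$ be decidable constraints about the content of the registers in $\sigma_\infty$, depending
on U. Then, the computation under U with $\sigma_\infty$ converges and diverges when, respectively:

$$
\begin{array}{lcl}
stop(U,\sigma_\infty) & \triangleq & \exists \tau_\infty,\ \exists \mathcal{T}(\sigma_\infty,U).\ \ \mathcal{T}(\sigma_\infty,U) \Rightarrow cp_{+}(U,\sigma_\infty,1,\tau_\infty) \\
loop(U,\sigma_\infty) & \triangleq & \exists \mathcal{I}(\sigma_\infty,U).\ \ \mathcal{I}(\sigma_\infty,U) \Rightarrow cp_{\infty}(U,\sigma_\infty,1)
\end{array}
$$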

As foreseen by the above comments about $cp_{+}$ and $cp_{\infty}$, the management of convergence and that of
divergence differ considerably when the configurations can be updated by computations.

Converging computations under U with initial $\sigma_\infty$ are accommodated in the intuitive way:
the halting is described by the program counter, which is eventually set to 0; moreover, the incremental
modification of $\sigma_\infty$ is reported in the final $\tau_\infty$. The premise $\mathcal{T}(\sigma_\infty,U)$ plays the role of a termination
condition, which, if needed, provides the extra potential of carrying out proofs by induction. In fact,
computations may converge essentially in two ways: with or without the presence of finite cycles. In
the latter case, the constraint just "guides" the control flow to the end of the program; in the presence of
cycles, it is exploited to pick out a parameter on which to reason by induction. Therefore, in our logical
setting, program-driven termination constraints make feasible formal proofs about the convergence and
the output of individual programs w.r.t. parametric configurations. In other words, such conditions allow
one to make formal the informal proofs by evidence that one may figure out by inspecting the programs.

Conversely, the modification of the starting configuration $\sigma_\infty$ within diverging computations under
U does not produce a final configuration, because $\sigma_\infty$ is updated ad infinitum. However, the modification
of $\sigma_\infty$ can be observed in the course of the computation, and such a configuration may be checked against
an invariance condition, which constrains its content. Therefore, the invariance condition $\mathcal{I}(\sigma_\infty,U)$ itself,
whose shape depends again on U, becomes the "guard" to ensure the non-termination¹².

Concerning the termination and invariance constraints, we restrict to universally quantified formulas 
on natural numbers, built via the logical operators and the arithmetic operations and predicates. 

11 We use here the notation $r : \tau_\infty$ to represent the configuration $(1 \mapsto r,\ i \mapsto r_i)^{i \in [2..\infty]}$.

12 We remark that the whole scenario is coherent w.r.t. the concept of computable function, that we will address in Section 6:
there is an output, which is extracted from the final $\tau_\infty$, if and only if a computation stops.

For the sake of illustrating the technical details, let us consider the parametric (i.e. variable-content)
configuration $\mu_\infty = (1 \mapsto m,\ 2 \mapsto n,\ 3 \mapsto p, \ldots)$ and the program $V = (1 \mapsto S(1),\ 2 \mapsto J(2,3,1))$. We can then
show that the computation under V with $\mu_\infty$ diverges, by choosing the invariant $n = p$ (while it converges
with the termination constraint $n \neq p$). To prove $\forall m,n,p.\ (n = p) \Rightarrow cp_{\infty}(V, \mu_\infty, 1)$ by structural
coinduction on the derivation within Coq's top-down proof environment, we assume in the proof context the
coinductive hypothesis, the variables and the invariant; then we execute the two instructions of V so that
the control flow loops back to the first instruction; finally we apply the coinductive hypothesis¹³, which
demands to prove that the new configuration satisfies the invariant constraint as well:

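\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{
          \dfrac{[\,n=p\,]}{n=p}
        }{[\,\mathsf{cp}_\infty(U,(1\mapsto m+1,\ 2\mapsto n,\ 3\mapsto p,\ldots),1)\,]_{(1)}}
      }{\mathsf{cp}_\infty(U,(1\mapsto m+1,\ 2\mapsto n,\ 3\mapsto p,\ldots),2)}\ (jt\cdot r)_\infty
    }{\mathsf{cp}_\infty(U,(1\mapsto m,\ 2\mapsto n,\ 3\mapsto p,\ldots),1)}\ (s\cdot r)_\infty
  }{\forall m,n,p\in\mathbb{N}.\ (n=p)\Rightarrow\mathsf{cp}_\infty(U,(1\mapsto m,\ 2\mapsto n,\ 3\mapsto p,\ldots),1)}\ \text{(introduction)}
}{\forall m,n,p\in\mathbb{N}.\ (n=p)\Rightarrow\mathsf{cp}_\infty(U,(1\mapsto m,\ 2\mapsto n,\ 3\mapsto p,\ldots),1)}\ (1)
\]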
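
The divergence argument above can also be observed operationally. The following Python sketch is only an informal illustration and is not part of the Coq development: the instruction encoding, the names `step` and `run`, and the `fuel` bound are assumptions introduced here. It executes $U=(1\mapsto S(1),\ 2\mapsto J(2,3,1))$ for a bounded number of steps and checks the invariant $n=p$ every time the control flow returns to the first instruction.

```python
from collections import defaultdict

# Hypothetical encoding: ("Z", n), ("S", n), ("T", m, n), ("J", m, n, q).
U = [("S", 1), ("J", 2, 3, 1)]  # U = (1 -> S(1), 2 -> J(2,3,1))

def step(prog, regs, pc):
    """Execute the instruction at 1-based index pc; return the next index."""
    ins = prog[pc - 1]
    if ins[0] == "Z":
        regs[ins[1]] = 0
    elif ins[0] == "S":
        regs[ins[1]] += 1
    elif ins[0] == "T":
        regs[ins[2]] = regs[ins[1]]
    elif ins[0] == "J" and regs[ins[1]] == regs[ins[2]]:
        return ins[3]
    return pc + 1

def run(prog, init, fuel, invariant=None):
    """Run at most `fuel` steps; return (halted, registers)."""
    regs = defaultdict(int, init)  # unmentioned registers hold 0
    pc = 1
    for _ in range(fuel):
        if not 1 <= pc <= len(prog):
            return True, dict(regs)
        if pc == 1 and invariant is not None:
            # The invariant must hold whenever control is back at instruction 1.
            assert invariant(regs), "invariant broken"
        pc = step(prog, regs, pc)
    return (not 1 <= pc <= len(prog)), dict(regs)

# n = p: still running after 1000 steps, the invariant never fails.
print(run(U, {1: 0, 2: 4, 3: 4}, fuel=1000, invariant=lambda r: r[2] == r[3])[0])
# n != p: the jump is not taken and the computation stops.
print(run(U, {1: 0, 2: 4, 3: 5}, fuel=1000)[0])
```

A bounded run can only suggest divergence; it is the coinductive proof that establishes it for every $m$, $n$ and $p$ with $n=p$.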

Adequacy (II). We complete now the discussion about the faithfulness of our encoding w.r.t. Cutland's
URM [12], undertaken in Section 3; the issues we have to address formally are the relationship between
finite and infinite configurations, and the semantics given in the current and the previous section.
As far as the configurations are concerned, we first define the inclusion and restriction concepts. 

Definition 5.3 (Configuration inclusion/restriction) Let $U$ be a program, $\sigma=(i\mapsto s_i)^{i\in[1..m]}$ a finite
configuration and $\tau_\infty=(i\mapsto t_i)^{i\in[1..\infty]}$ an infinite one. Then, inclusion and restriction are defined as follows:

\[ \sigma\subseteq\tau_\infty \;\triangleq\; (\forall i\in[1..m].\ t_i=s_i)\wedge(\forall l>m.\ t_l=0) \]

\[ \tau_\infty|_U \;\triangleq\; (i\mapsto t_i)^{i\in[1..\rho(U)]} \]
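
As an informal reading of this definition, the Python sketch below implements inclusion and restriction under two assumptions that are not taken from the formalization: an infinite configuration is modelled by a dictionary with finite support (every register that is not listed holds 0), and $\rho(U)$ is taken to be the highest register index mentioned in $U$. The names `rho`, `included` and `restrict`, as well as the tuple encoding of instructions, are hypothetical.

```python
# Hypothetical encoding: ("Z", n), ("S", n), ("T", m, n), ("J", m, n, q).
U = [("J", 1, 2, 5), ("S", 2), ("S", 3), ("J", 1, 1, 1), ("T", 3, 1)]

def rho(prog):
    """Largest register index mentioned by any instruction of prog."""
    regs = [0]
    for ins in prog:
        # In J(m, n, q) the last field is a jump target, not a register.
        fields = ins[1:3] if ins[0] == "J" else ins[1:]
        regs.extend(fields)
    return max(regs)

def included(sigma, tau_inf):
    """sigma is included in tau_inf: equal on [1..m], and tau_inf is 0 beyond m."""
    m = len(sigma)
    agree = all(tau_inf.get(i, 0) == sigma[i - 1] for i in range(1, m + 1))
    zero_tail = all(v == 0 for i, v in tau_inf.items() if i > m)
    return agree and zero_tail

def restrict(tau_inf, prog):
    """Restriction of tau_inf to the registers [1..rho(prog)], as a list."""
    return [tau_inf.get(i, 0) for i in range(1, rho(prog) + 1)]

print(included([5, 3], {1: 5, 2: 3}))   # the tail of the stream is all zeros
print(restrict({1: 5, 2: 3, 7: 9}, U))  # register 7 is beyond rho(U)
```

The finite-support assumption is what makes the condition on the tail decidable here; on genuine streams that condition can only be stated, not computed.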

Concerning the semantics, let us assume (without displaying the rules) to have introduced a second
definition for both the predicates $\mathsf{cp}_+$ and $\mathsf{cp}_\infty$, to cope with finite configurations and for which we use
an overloaded notation. The new rules differ from those of the original definition only in that the finite
configurations involved require the extra compatibility constraint with programs, analogously to Definition 4.1.

Now we can state the equivalence between the finite and infinite configuration encodings $\mathit{Cgn}$ and $\mathit{Cgn}_\infty$.

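Theorem 5.4 (Configurations equivalence) Let $U=(i\mapsto I_i)^{i\in[1..n]}$ be a program, $\sigma$ and $\tau$ finite
configurations, $\sigma_\infty$ and $\tau_\infty$ infinite configurations, and let $h\in[1..n]$. Then the following properties hold:

1. $\mathsf{cp}_+(U,\sigma,h,\tau)\wedge\sigma\models U\wedge\sigma\subseteq\sigma_\infty\wedge\tau\subseteq\tau_\infty \Rightarrow \mathsf{cp}_+(U,\sigma_\infty,h,\tau_\infty)$

2. $\mathsf{cp}_\infty(U,\sigma,h)\wedge\sigma\models U\wedge\sigma\subseteq\sigma_\infty \Rightarrow \mathsf{cp}_\infty(U,\sigma_\infty,h)$

3. $\mathsf{cp}_+(U,\sigma_\infty,h,\tau_\infty) \Rightarrow \mathsf{cp}_+(U,\sigma_\infty|_U,h,\tau_\infty|_U)$

4. $\mathsf{cp}_\infty(U,\sigma_\infty,h) \Rightarrow \mathsf{cp}_\infty(U,\sigma_\infty|_U,h)$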

13 The application of the coinductive hypothesis is guarded by the two constructors $(s\cdot r)_\infty$ and $(jt\cdot r)_\infty$ (see also Section 2).
PROOF. (1, 3) By induction on the evaluation hypothesis. (2, 4) By coinduction on the derivation.

Even though the above theorem establishes that working with finite, list-like configurations is equivalent
to working with infinite, stream-like ones, we have so far preferred to handle infinite configurations.
Our choice is motivated by two reasons: stream configurations do not require the overhead of managing
side-conditions to model the compatibility with programs, and it has not yet been necessary to perform
proofs by induction on the structure of configurations themselves.

In the end, the reader can see that our machinery provides the user with a logic for the URM, i.e. 
a formal system whose potential may be exploited to prove properties about the semantics of URM 
programs and the encoding itself, a direction we will comment on further in the final section. 

To consider the adequacy issue, we conjecture that our formalization internalizes faithfully the very 
initial theory developed by Cutland on paper, i.e. the part concerning the synthesis and the execution 
of individual programs. By addressing the task formally, the soundness of our encoding is apparent (as 
our programs coincide with Cutland's ones, and we have coupled to programs a formal logical system); 
moreover, we state a limited form of completeness, in the following sense. 

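Conjecture 5.5 (Adequacy) Let $P$ be a URM program and $U=(i\mapsto I_i)^{i\in[1..n]}$ its faithful encoding. Then:

1. If $P(a_1,a_2,\ldots,a_m)\downarrow b$, then there exist $\tau=(1\mapsto b,\ i\mapsto t_i)^{i\in[2..m]}$ and $\mathcal{T}((i\mapsto a_i)^{i\in[1..m]},U)$ such that
$\mathcal{T}((i\mapsto a_i)^{i\in[1..m]},U) \Rightarrow \mathsf{cp}_+(U,(i\mapsto a_i)^{i\in[1..m]},1,\tau)$

2. If $P(a_1,a_2,\ldots,a_m)\uparrow$, then there exists $\mathcal{J}((i\mapsto a_i)^{i\in[1..m]},U)$ such that
$\mathcal{J}((i\mapsto a_i)^{i\in[1..m]},U) \Rightarrow \mathsf{cp}_\infty(U,(i\mapsto a_i)^{i\in[1..m]},1)$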

PROOF. (1) By inspection on the hypothetical evaluation (to devise the termination constraint, which
depends on the initial configuration $(i\mapsto a_i)^{i\in[1..m]}$), then by induction (see also Section 6). (2) By
inspection on the hypothetical evaluation (to devise the invariant), then by structural coinduction.

To conclude, we remark that, after the introduction of the very basic computability theory, Cutland 
develops "higher-order" methods, to devise new computable functions without having to write programs. 
It is immediate that addressing this kind of adequacy, at the moment, is out of the scope of our approach. 

6 An example: partial minus 

The next step of our work is to address slightly more involved concepts: in this section we exploit the 
formalization developed so far, by tuning it to deal with the functions computed by the URM. 

The formal notion of (partial) computable function arises naturally in Cutland's presentation [12]
after the preliminary definitions reported in Section 3. Namely, a program $P$ computes a function
$f:\mathbb{N}^m\to\mathbb{N}$ when, for every $(a_1,a_2,\ldots,a_m,b)\in\mathbb{N}^{m+1}$, the computation $P(a_1,a_2,\ldots,a_m)$ stops and $b$ is
stored in the register $R_1$ in the final configuration (this is written $P(a_1,a_2,\ldots,a_m)\downarrow b$) if and only if:

\[ (a_1,a_2,\ldots,a_m)\in\mathrm{dom}(f) \quad\text{and}\quad f(a_1,a_2,\ldots,a_m)=b \]

A relevant application supported by our machinery is to address the certification of URM programs:
that is, proving that a program meets the specification it is designed for. The example we will be working
out in this section is the partial subtraction function $\mathit{sub}:\mathbb{N}\times\mathbb{N}\to\mathbb{N}$:

\[
\mathit{sub}(m,n) =
\begin{cases}
m-n & \text{if } m\geq n \\
\text{undefined} & \text{if } m<n
\end{cases}
\]
An algorithm to make the URM compute this function is the following: if $m$ and $n$ are loaded, respectively,
in $R_1$ and $R_2$, then try to let $n$ reach $m$ by performing Successor operations on $R_2$; correspondingly
increment $R_3$, whose content is initially set to 0, to record the number of steps performed on $R_2$. This
algorithm gives rise to a loop in the computation, which comes to an end if and only if $m\geq n$. In any case, at
any completion of the loop, the snapshot of the register contents is the following:

\[
\begin{array}{cccc}
R_1 & R_2 & R_3 & R_4 \\
\hline
m & n+k & k &
\end{array}
\]

The algorithm can be implemented, for example, by the following URM program:
$U = (\,1\mapsto J(1,2,5),\ 2\mapsto S(2),\ 3\mapsto S(3),\ 4\mapsto J(1,1,1),\ 5\mapsto T(3,1)\,)$
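
To make the behaviour of this program concrete, the following Python sketch simulates $U$ with a step bound. It is an informal illustration only, not part of the Coq development; the instruction encoding, the names `step` and `sub_urm`, and the `fuel` parameter are assumptions introduced here. At every return to the first instruction the sketch checks the register snapshot $R_1=m$, $R_2=n+k$, $R_3=k$ described above.

```python
from collections import defaultdict

# Hypothetical encoding of
# U = (1->J(1,2,5), 2->S(2), 3->S(3), 4->J(1,1,1), 5->T(3,1)).
U = [("J", 1, 2, 5), ("S", 2), ("S", 3), ("J", 1, 1, 1), ("T", 3, 1)]

def step(prog, regs, pc):
    """Execute the instruction at 1-based index pc; return the next index."""
    ins = prog[pc - 1]
    if ins[0] == "Z":
        regs[ins[1]] = 0
    elif ins[0] == "S":
        regs[ins[1]] += 1
    elif ins[0] == "T":
        regs[ins[2]] = regs[ins[1]]
    elif ins[0] == "J" and regs[ins[1]] == regs[ins[2]]:
        return ins[3]
    return pc + 1

def sub_urm(m, n, fuel=10_000):
    """Return the content of R1 if U halts within `fuel` steps, else None."""
    regs = defaultdict(int, {1: m, 2: n})  # R3 and beyond start at 0
    pc, k = 1, 0                           # k counts completed loop rounds
    for _ in range(fuel):
        if not 1 <= pc <= len(U):
            return regs[1]                 # halted: the output is read from R1
        if pc == 1:
            # Snapshot at each completion of the loop: R1=m, R2=n+k, R3=k.
            assert (regs[1], regs[2], regs[3]) == (m, n + k, k)
            k += 1
        pc = step(U, regs, pc)
    # Fuel exhausted: report a result only if the last step halted the machine.
    return regs[1] if not 1 <= pc <= len(U) else None

print(sub_urm(5, 3))             # m >= n: the computation stops
print(sub_urm(3, 5, fuel=1000))  # m < n: no result within the bound
```

Returning `None` after the bound is reached only means that no output was observed; it does not by itself prove divergence.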

The program, as required, is designed to increment $r_2$ and $r_3$ in parallel and to stop if, and only
if, $r_2=r_1$. It is then immediate to see that the computations under $U$ may converge or diverge
depending on the initial configuration: therefore, the implementation of the partial subtraction function
has to be certified in two steps, by using the predicates $\mathsf{cp}_\infty$ and $\mathsf{cp}_+$ defined in the previous section.

On the one hand, we prove via $\mathsf{cp}_\infty$ that the computation under $U$ diverges with the configurations
$(1\mapsto m,\ 2\mapsto n,\ldots)$ such that $m<n$ (which is the "invariant"). To complete the analysis, we establish via
$\mathsf{cp}_+$ that the computation under $U$ converges to $m-n$ with the configurations $(1\mapsto m,\ 2\mapsto n,\ 3\mapsto 0,\ldots)$
such that $m\geq n$ (this, in turn, plays the role of the "termination" constraint).

Theorem 6.1 (Partial minus) Let $\sigma=(1\mapsto\sigma_1,\ 2\mapsto\sigma_2,\ 3\mapsto\sigma_3,\ldots)$ be a parameter configuration. Then,
the implementation of the partial minus function is certified by the following properties:

1. (Divergence) $\sigma_1<\sigma_2 \Rightarrow \mathsf{cp}_\infty(U,\sigma,1)$

2. (Convergence) $\sigma_1\geq\sigma_2 \Rightarrow \mathsf{cp}_+(U,\sigma,1,(1\mapsto\sigma_1-\sigma_2+\sigma_3,\ 2\mapsto\sigma_1,\ 3\mapsto\sigma_1-\sigma_2+\sigma_3,\ldots))$

PROOF. (1.) By structural coinduction on the derivation. Assume the coinductive hypothesis, then
evaluate the first four instructions so that the control flow loops back to the first instruction, finally apply
the coinductive hypothesis and prove that the updated configuration satisfies the invariant constraint$^{14}$:

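\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{
          \dfrac{
            \dfrac{
              \dfrac{[\,\sigma_1<\sigma_2\,]}{\sigma_1<\sigma_2+1}
            }{[\,\mathsf{cp}_\infty(U,(1\mapsto\sigma_1,\ 2\mapsto\sigma_2+1,\ 3\mapsto\sigma_3+1,\ldots),1)\,]_{(1)}}
          }{\mathsf{cp}_\infty(U,(1\mapsto\sigma_1,\ 2\mapsto\sigma_2+1,\ 3\mapsto\sigma_3+1,\ldots),4)}
        }{\mathsf{cp}_\infty(U,(1\mapsto\sigma_1,\ 2\mapsto\sigma_2+1,\ 3\mapsto\sigma_3,\ldots),3)}
      }{\mathsf{cp}_\infty(U,(1\mapsto\sigma_1,\ 2\mapsto\sigma_2,\ 3\mapsto\sigma_3,\ldots),2)}\ (s\cdot r)_\infty
    }{\mathsf{cp}_\infty(U,(1\mapsto\sigma_1,\ 2\mapsto\sigma_2,\ 3\mapsto\sigma_3,\ldots),1)}
  }{\forall\sigma=(1\mapsto\sigma_1,\ 2\mapsto\sigma_2,\ 3\mapsto\sigma_3,\ldots).\ \sigma_1<\sigma_2\Rightarrow\mathsf{cp}_\infty(U,\sigma,1)}\ \text{(introduction)}
}{\forall\sigma=(1\mapsto\sigma_1,\ 2\mapsto\sigma_2,\ 3\mapsto\sigma_3,\ldots).\ \sigma_1<\sigma_2\Rightarrow\mathsf{cp}_\infty(U,\sigma,1)}\ (1)
\]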

14 See Section 2 about the conventions for displaying $CC^{(Co)Ind}$ top-down proofs in natural deduction style.
(2.) By induction on $p=\sigma_1-\sigma_2$. If $p=0$, the evaluation of the program $U$ reduces to executing just the
first instruction (the Jump condition is true) and the last one, hence the thesis is immediate. If $p=q+1$, the
evaluation of the first four instructions causes the control flow to loop back to the first instruction, with
the configuration $(1\mapsto\sigma_1,\ 2\mapsto\sigma_2+1,\ 3\mapsto\sigma_3+1,\ldots)$; the thesis follows from the inductive hypothesis.

Finally, choosing $\sigma_3=0$ implies the convergence of the computation under $U$ with $\sigma$ to $\sigma_1-\sigma_2$.
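
The statement of Theorem 6.1 can be sanity-checked on small inputs. The Python sketch below is an informal test and not a substitute for the Coq proof; the names `step`, `evaluate` and `check_theorem`, the tuple encoding of instructions and the fuel bounds are assumptions introduced here. For $\sigma_1\geq\sigma_2$ it compares the final registers with those stated in the convergence property, and it also checks that the run takes $4p+2$ steps with $p=\sigma_1-\sigma_2$, which mirrors the induction on $p$ (two steps for $p=0$, four more for each round of the loop).

```python
from collections import defaultdict

# Hypothetical encoding of
# U = (1->J(1,2,5), 2->S(2), 3->S(3), 4->J(1,1,1), 5->T(3,1)).
U = [("J", 1, 2, 5), ("S", 2), ("S", 3), ("J", 1, 1, 1), ("T", 3, 1)]

def step(prog, regs, pc):
    """Execute the instruction at 1-based index pc; return the next index."""
    ins = prog[pc - 1]
    if ins[0] == "Z":
        regs[ins[1]] = 0
    elif ins[0] == "S":
        regs[ins[1]] += 1
    elif ins[0] == "T":
        regs[ins[2]] = regs[ins[1]]
    elif ins[0] == "J" and regs[ins[1]] == regs[ins[2]]:
        return ins[3]
    return pc + 1

def evaluate(s1, s2, s3, fuel):
    """Return ((R1, R2, R3), steps) if U halts within `fuel` steps, else None."""
    regs = defaultdict(int, {1: s1, 2: s2, 3: s3})
    pc, steps = 1, 0
    while 1 <= pc <= len(U):
        if steps == fuel:
            return None  # still running after `fuel` steps
        pc = step(U, regs, pc)
        steps += 1
    return (regs[1], regs[2], regs[3]), steps

def check_theorem(bound):
    """Check both properties for all s1, s2, s3 below `bound`."""
    for s1 in range(bound):
        for s2 in range(bound):
            for s3 in range(bound):
                p = s1 - s2
                if p >= 0:
                    final, steps = evaluate(s1, s2, s3, fuel=4 * p + 2)
                    assert final == (p + s3, s1, p + s3)
                    assert steps == 4 * p + 2
                else:
                    # No halt is observed; this is evidence, not a proof.
                    assert evaluate(s1, s2, s3, fuel=1000) is None
    return True

print(check_theorem(6))
```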

Inductive versus coinductive evaluations. Regarding partial functions, it is apparent that the two
predicates $\mathsf{cp}_+$ and $\mathsf{cp}_\infty$ are complementary: the first one is responsible for the treatment of the
elements in the domain of the function involved, and the second one for all the remaining computations.

About this separation between inductive and purely coinductive evaluations, we wish to remark that
it has not been possible to deal with the semantics of URM programs by using a unique, potentially
coinductive judgment. Actually, restricting e.g. to abstract programs, if such a predicate were defined
through the rules $(l\cdot l)_+$, $(t\cdot l)_+$, $(l\cdot r)_\infty$ and $(t\cdot r)_\infty$ of Definition 4.1, it would be too weak. Far from being
an obstacle to our goals, this fact has merely required duplicating a part of the encoding, to define both $\mathsf{cp}_+$ and
$\mathsf{cp}_\infty$; in any case, such a solution provides an extra proof principle, i.e. the possibility of carrying
out proofs by structural induction on the derivation of converging computations.

Nevertheless, these considerations about the relationship between inductive, potentially and purely
coinductive evaluation point out the need for further research efforts, along the lines pursued by the much
more advanced work by Leroy and Grall [22] (see the next section for the discussion of related work).

7 Further and related work 

In this document we have given an account of an experiment in $CC^{(Co)Ind}$, about modeling and reasoning
on the execution of converging and diverging low-level, assembly-like programs, carried out by the
Unlimited Register Machine (URM) [12]. The particular perspective which has inspired our research is
the formalization of a workbench to certify the implementation of the functions computed by the URM;
as a proof of concept, we have addressed the partial minus function on natural numbers. The encoding
technique needed to accomplish our goal is quite plain, apart from the use of coinduction: in fact,
we have taken full advantage of the (co)inductive specification and proof principles provided by the
$CC^{(Co)Ind}$ intuitionistic type theory and mechanized in the Coq proof assistant [17, 26].

In this final section we sketch some hints to exploit the potential of our formalization, along two 
main directions: computability and traces of execution. 

Computability. In our work we have mastered the very basic computability theory of the URM: essen- 
tially, we are able to prove that specific URM programs implement the functions they are designed for. 
So we have coupled a logic, whose mechanization is supported by Coq, to the bare URM. Nevertheless, 
exploiting the machinery requires a non-trivial analysis and practice by the user, who has to pick out 
ad-hoc properties (termination and invariant conditions) to achieve the certification of URM code. 

At this point, to pursue the formalization of computability theory in greater depth, one has to
change perspective slightly and move to a more abstract level. This actually opens two new directions, which
form the core of computability: lifting from programs to the functions they implement, and
describing "higher-order" methods to combine such functions into new, more sophisticated
computable functions. Therefore, one should add at least a new meta-level, where partial functions are
first-class citizens. A possible approach towards this goal is to investigate more abstract properties of
URM programs, such as equivalence. This effort, in turn, would open further research lines, and tends
again, as invariance does, towards the objective of capturing not only the outcome of the execution of programs,
but also the observable effects.

As far as we know, there is no related work about formalizing the historical models used to develop 
the computability theory (and the URM, in particular). We see this as a serious gap from the point of 
view of certified mathematics, a framework where research is nowadays intense; hence the present
document is also an effort to help close this gap.

Traces of execution. Leroy and Grall [22] adopt coinduction within $CC^{(Co)Ind}$ to capture both finite and
infinite evaluations of a call-by-value $\lambda$-calculus. The motivation of that work is the attempt to describe
big-step semantics by coinduction, because big-step semantics is more convenient than small-step to
prove the correctness of program transformations, such as compilation. Nevertheless, big-step semantics
is traditionally defined by induction, thus allowing one to describe only terminating evaluations.

Grall and Leroy prove that (only) a big-step semantics that separates terminating evaluation
(described by an inductive predicate) from diverging evaluation (described by a purely coinductive
predicate) corresponds exactly to finite and non-finite small-step reductions. Afterwards, the authors extend
both semantics to produce not only the outcome of an evaluation (convergence and output, or
divergence) but also an execution trace, in the form of a potentially infinite sequence of terms representing the
intermediate reducts of the source program. This extension is fundamental to establish semantic
preservation properties for program transformation (such as compilation) and is very important to investigate
observational equivalence for imperative languages.

Therefore, it would be stimulating to experiment with traces of execution for the URM (for example
in the form of potentially infinite sequences of configurations) to address e.g. equivalence of programs.
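
As a rough indication of what such traces could look like, the following Python sketch produces the sequence of configurations of a URM computation lazily, so that the same generator covers both the finite and the infinite case. This is only an illustration under assumptions of ours: the tuple encoding of instructions, the names `step` and `trace`, and the `width` parameter (how many registers each snapshot records) do not come from the formalization.

```python
from collections import defaultdict
from itertools import islice

# Hypothetical encoding of the partial minus program of Section 6.
U = [("J", 1, 2, 5), ("S", 2), ("S", 3), ("J", 1, 1, 1), ("T", 3, 1)]

def step(prog, regs, pc):
    """Execute the instruction at 1-based index pc; return the next index."""
    ins = prog[pc - 1]
    if ins[0] == "Z":
        regs[ins[1]] = 0
    elif ins[0] == "S":
        regs[ins[1]] += 1
    elif ins[0] == "T":
        regs[ins[2]] = regs[ins[1]]
    elif ins[0] == "J" and regs[ins[1]] == regs[ins[2]]:
        return ins[3]
    return pc + 1

def trace(prog, init, width):
    """Lazily yield (pc, registers 1..width); finite iff the run halts."""
    regs = defaultdict(int, init)
    pc = 1
    while 1 <= pc <= len(prog):
        yield pc, tuple(regs[i] for i in range(1, width + 1))
        pc = step(prog, regs, pc)
    # Final configuration, reached only by converging computations.
    yield pc, tuple(regs[i] for i in range(1, width + 1))

# Converging run: the whole trace can be collected.
print(list(trace(U, {1: 2, 2: 2}, width=3)))
# Diverging run: only a finite prefix of the trace can be inspected.
print(list(islice(trace(U, {1: 0, 2: 1}, width=3), 5)))
```

Comparing finite prefixes of two such traces can refute an equivalence between programs but cannot establish it, which is where a coinductive treatment would be needed.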

Other work related to divergence or low-level languages. There are several contributions in the
literature exploiting the potential of coinductive definitions and proofs within $CC^{(Co)Ind}$ to master the
fundamental concept of non-terminating computation. Some of these approaches concern transition
systems [10, 3], linear temporal logic [9, 3] and process algebras [18, 20].

Finally, from a complementary point of view, we observe that in recent years the metatheory of 
low-level machines has been studied by several authors in more realistic settings [11, 25, 6].